ABSTRACT

We investigate the evolution of the galaxy two-point correlation function (CF) over a wide redshift range, 0.2 < z < 3. For the first time, the systematic analysis covers redshifts above 1-1.5. The catalogue of ~250000 galaxies with $i^+ < 25$ and known photometric redshifts in the Subaru Deep field is used. The galaxies are divided into three luminosity classes and several distance/redshift bins. First, the 2D CF is determined for each luminosity class and distance bin. Calculations are based on the quantitative differences between the surface distributions of galaxy pairs with comparable and distinctly different photometric redshifts. The power law approximation for the CF is used. The limited accuracy of photometric redshifts as compared to the spectroscopic ones has been examined and taken into account. Then, the 3D functions for all the selected luminosities and distances are calculated. The power law parameters of the CF, the slope and the correlation length, are determined. Neither parameter shows strong variations over the whole investigated redshift range. The slope for the luminous galaxies appears to be consistently steeper than that for the fainter ones. The linear bias factor, $b(z)$, grows systematically with redshift; assuming the local normalization $b(0) \approx 1.1 - 1.2$, the bias reaches 3-3.5 at the high redshift limit.

Key words: galaxies: distances and redshifts - galaxies: statistics - galaxies: structure.


1 INTRODUCTION 

Growth of galaxy structures provides essential information on the evolution of the dark matter distribution (Marulli et al. 2013 and references therein). Observations of the large scale structures would, possibly, also give insight into the very nature of dark energy (e.g. Jenkins et al. 1998; Jennings et al. 2011; Huterer et al. 2015). Gravitationally dominating dark matter induces growth of the density fluctuations that eventually lead to the formation of galaxies. From that moment on, the large scale matter distribution generated in the computer simulations becomes, at least potentially, subject to the observational constraints.

However, distinctly different physical properties of the collisionless dark matter and the visible, baryonic matter make the interplay between those constituents intricate. Flows of baryonic matter towards gravitational wells created by concentrations of the dark matter involve complex processes of gas accretion, shock heating, and radiative cooling. In effect, the observable galaxy structures do not follow exactly the dark matter distribution. The relationship between the density fluctuations of the galaxies and dark matter was formulated on statistical grounds by Kaiser (1984). The amplitude ratio of these fluctuations, known as the galaxy bias, is roughly linear, although it depends on the smoothing scale (Mo & White 1996). The bias is also a function of time (e.g. Fry 1996). Moreover, the galaxy clustering depends on galaxy luminosity, mass, colour, and other parameters. In the local Universe these relationships have been extensively investigated (e.g. Coupon et al. 2012; see also Marulli et al. 2013 for the comprehensive reference list).


Complex dark matter structures generated by the gravitational instability revealed in the cosmological simulations, combined with the obscure nature of the dark matter itself, open a space for models that describe the distribution of luminous matter with the dark one (e.g. Berlind & Weinberg 2002; Papageorgiou et al. 2012). The study of the luminous-dark matter relationship incorporates two separate issues: adequate statistical description of the galaxy distribution and comprehensive cosmological simulations of dark matter, both determined over a wide redshift range. In the present paper we concentrate on the galaxy distribution. This question has been investigated extensively for a long time and in recent years gained momentum, mostly as a result of massive automated galaxy surveys. Although many characteristics of the galaxy clustering are precisely measured at selected magnitudes and/or redshifts, the evolution of the clustering over a wide redshift range is still not well determined. In the present paper we pursue this question, i.e. to what extent the observational data constrain parameters of the cosmic evolution of the galaxy clustering over the whole observable cosmic time.

In most cases questions related to the galaxy clustering are adequately addressed using the correlation functions (CFs). Usually the observed function is satisfactorily represented by a power law with two fitted parameters: the correlation length and slope. Only high-quality statistical material allows for a more detailed study (e.g. Baugh 1996; Norberg et al. 2002; Martinez-Manso et al. 2015). Here we apply the power-law model for two reasons. First, the present data do not allow for a more refined analysis, and second, our method works efficiently using this approximation. Additionally, we believe that the question of the clustering evolution is still adequately addressed by the search for variations with time of the power-law parameters. A large number of investigations in recent years examined how the shape of the galaxy CF depends on the overall galactic parameters such as stellar mass, luminosity, type, colour or star formation rate. In the local universe and at low to moderate redshifts, all these studies indicate that the amplitude of the CF increases with luminosity (e.g. Norberg et al. 2002; Pollo et al. 2006; Coil et al. 2006; Li et al. 2006; Wake et al. 2011; Zehavi et al. 2011; Marulli et al. 2013), [...] et al. 2009), or limited to specific types of galaxies (e.g. Lin et al. 2012; Mostek et al. 2013). On the other hand, several studies indicate little or no dependence of the correlation slope on galaxy parameters (e.g. Norberg et al. 2002; Coil et al. 2006; Marulli et al. 2013, but see Pollo et al. 2006).

If no information on the galaxy radial distance is available, the amplitude of spatial correlations is derived from the 2D CF. The efficiency of this approach is limited to relatively shallow galaxy samples. This is because in the deep galaxy surveys the correlation signal is diluted by a large number of random coincidences. Galaxy catalogues with redshifts offer a natural method to eliminate most of the random coincidences, which substantially improves the signal to noise (S/N) ratio. Unfortunately, acquisition of the spectroscopic redshifts is time consuming and restricted to relatively bright objects.

The photometric redshifts, as compared to the spectroscopic ones, are substantially less accurate distance indicators. Nevertheless, they provide at least a rough estimate of the galaxy position. Thus, photometric redshifts could be used to identify and extract evident random pairs, and to increase in this way the S/N ratio of the correlation amplitude measurement. The photometric redshifts were successfully used in the past for investigations of the angular clustering (Heinis et al. 2007; McCracken et al. 2008; Wake et al. 2011) and the spatial clustering (Arnalte-Mur et al. 2014; Bielby et al. 2014). In the present paper we apply a different technique to obtain the galaxy CF for a wide range of redshifts, not covered by previous investigations. We use the 2 deg$^2$ COSMOS photometric redshift catalogue by Ilbert et al. (2009), available through the Web site of IPAC/IRSA.


According to a standard procedure, to determine the 2D CF one generates a large set of randomly distributed points. Properly normalized numbers of galaxy-random point pairs are then used as a reference distribution of pair separations representing a random galaxy population. A comparison of the physical galaxy-galaxy pairs with the galaxy-random point pairs allows one to assess fluctuations of the galaxy distribution and eventually to calculate the correlation signal. The efficiency of this method is highly sensitive to the interference with the cosmic signal of various selection effects related to the data processing. For instance, even minute variations in the catalogue depth or the image quality result in fluctuations of the surface density of objects that could be easily misidentified with the galaxy clustering. To minimize this kind of confusion, the 'random' points should be distributed in such a way as to mimic all the inhomogeneities of non-cosmic origin. The performance of this widely applied procedure depends on how precisely such mock catalogues are free from all the observational biases.

To reduce the instrumental bias, we apply here a different approach. We assess the number of galaxy pairs expected for the random distribution by means of the galaxy-galaxy pairs with sufficiently different photometric redshifts that a physical connection is effectively excluded. The distribution of such pairs incorporates most signatures associated with the data bias and processing, while it is free from the physical clustering signal.

The organization of the paper is as follows. In the next section, we give a short description of the COSMOS photometric redshift catalogue. In Section 3 we describe the details of the present method to calculate the 2D CF, derive the relevant formulae and present results of these calculations. Formulae applied to determine the 3D CF for different luminosities and over several redshift bins are given in Sect. 4. Statistical properties of the photometric redshift measurements are also described in this section. The evolution of bias in the linear model and a short comparison of our measurements of the CF evolution with selected previous results are presented in Sect. 5. Strong and weak points of our method are summarized there.

In the paper we consistently parameterize the COSMOS catalogue data and results of the investigation using the comoving distances alongside the redshift. To convert redshifts to the comoving distances, we use the flat cosmological model with $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$, $\Omega_{m,0} = 0.3$ and $\Omega_{\Lambda,0} = 0.7$.
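The redshift-to-distance conversion quoted above can be illustrated with a short numerical sketch. It assumes a flat model with matter and a cosmological constant only (radiation neglected) and integrates $1/E(z)$ with Simpson's rule; the function name and the number of integration steps are illustrative choices, not taken from the paper.

```python
import math

C_KM_S = 299792.458  # speed of light [km/s]
H0 = 70.0            # Hubble constant [km/s/Mpc], value adopted in the text
OMEGA_M = 0.3        # matter density parameter adopted in the text
OMEGA_L = 0.7        # cosmological constant density parameter adopted in the text


def comoving_distance(z, steps=2000):
    """Line-of-sight comoving distance [Mpc] for a flat model (Simpson's rule)."""
    if z <= 0.0:
        return 0.0
    if steps % 2:          # Simpson's rule needs an even number of intervals
        steps += 1
    h = z / steps

    def inv_e(x):
        # 1/E(x) with E(x) = sqrt(Omega_m (1+x)^3 + Omega_Lambda); radiation neglected
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + x) ** 3 + OMEGA_L)

    total = inv_e(0.0) + inv_e(z)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * inv_e(k * h)
    return C_KM_S / H0 * total * h / 3.0
```

For example, `comoving_distance(1.0)` gives roughly 3300 Mpc for these parameters, which places z = 1 near the middle of the distance range analysed in the paper.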


2 THE DATA 


The COSMOS photometric redshift catalogue is presented in detail by Ilbert et al. (2009). Here, we give only the information relevant to the present investigation. The catalogue contains 385065 objects in the deep Subaru Area, of which almost 252000 have been classified as galaxies brighter than $i^+_{AB} = 25$. The galaxies are distributed within a square of 84 arcmin a side centred at $\alpha_c = 150.1^\circ$ and $\delta_c = 2.2^\circ$ (Taniguchi et al. 2007). However, the data coverage is nonuniform on various angular scales. In Fig. 1 the distribution of all the galaxies is shown. One can see a large number of masked areas of poor image quality (mostly saturated star images and CCD related problems).

Figure 1. Distribution of ~250000 COSMOS galaxies. An uneven data coverage affects the galaxy correlation analysis.

Figure 2. The spatial density of the COSMOS galaxies as a function of the comoving distance (bottom abscissa) or redshift (top abscissa). Highly fluctuating (dotted) curve represents the moving average calculated within ±50 Mpc, while the smoother one (solid) - within ±500 Mpc.

To measure the photometric redshifts, the Le Phare algorithm (Arnouts & Ilbert 2011; www.cfht.hawaii.edu/~arnouts/LEPHARE/DOWNLOAD/lephare_doc.pdf) was applied using 30 filters that cover UV, optical and NIR. According to Ilbert et al. (2009), the photo-z accuracy depends on galaxy redshift and magnitude, and it is suitably characterized by $\sigma_{\Delta z/(1+z_s)}$ defined as $1.48 \times \mathrm{median}(|z_p - z_s|/(1+z_s))$, where $z_p$ and $z_s$ denote photometric and spectroscopic redshift, respectively. For $i^+_{AB} < 22.5$ the dispersion is $\sigma_{\Delta z/(1+z_s)} = 0.007$, while for $i^+_{AB} < 24$ and $z < 1.25$, $\sigma_{\Delta z/(1+z_s)} = 0.012$. At fainter magnitudes the estimates are less accurate, e.g. for $i^+_{AB} \sim 24$ and $z \sim 2$, $\sigma_{\Delta z/(1+z_s)} = 0.06$.
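The dispersion statistic defined above is simple to evaluate for any sample with both redshift types. The sketch below computes it together with the fraction of objects whose normalized deviation exceeds a chosen threshold; the default of 0.15 is the value Ilbert et al. (2009) use to define catastrophic failures. The function name and the input lists are hypothetical.

```python
import statistics


def photoz_quality(z_phot, z_spec, threshold=0.15):
    """Return (sigma, outlier_fraction) for matched photometric/spectroscopic redshifts.

    sigma   = 1.48 * median(|z_p - z_s| / (1 + z_s))
    outlier = fraction of objects with |z_p - z_s| / (1 + z_s) > threshold
    """
    resid = [abs(zp - zs) / (1.0 + zs) for zp, zs in zip(z_phot, z_spec)]
    sigma = 1.48 * statistics.median(resid)
    outlier_fraction = sum(r > threshold for r in resid) / len(resid)
    return sigma, outlier_fraction
```

Because the median is insensitive to a few very large deviations, the dispersion and the outlier fraction carry complementary information and are best quoted together.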

Although the distribution of differences $(z_p - z_s)/(1 + z_s)$ is roughly fitted by a Gaussian function, the median statistics in the $\sigma_{\Delta z/(1+z_s)}$ definition above indicates that gross errors occasionally affect the $z_p$ estimates. Ilbert et al. (2009) define a 'catastrophic failure' of the $z_p$ estimate if $|z_p - z_s|/(1 + z_s) > 0.15$. In the case of bright galaxies ($i^+_{AB} < 22.5$), the fraction of catastrophic failures amounts to 0.7 percent. It rises, however, to 20 percent for galaxies at $1.5 < z_s < 3$. The median apparent magnitude of those galaxies

Figure 3. The distribution of galaxies in the absolute $V_J$ magnitude - comoving distance plane. Inclined dashed lines indicate luminosity sub-samples A-D, while the vertical lines define distance bins used in the calculations. Full dots - see text.

The distribution of $z_p$ over $z_s$ also exhibits some peculiarities that are not depicted by $\sigma_{\Delta z/(1+z_s)}$. The $z_p - z_s$ differences are correlated with $z_s$ over scales comparable to and larger than the dispersion $\sigma_{\Delta z/(1+z_s)}$ quoted above. In the right-hand panel of fig. 10 in Ilbert et al. (2009) the distribution of $z_p$ versus $z_s$ exhibits systematic, to some extent coherent variations around the line $z_p = z_s$, indicating non-random, large-scale deviations between $z_p$ and $z_s$. The effect of the non-random character of $z_p$ deviations from the spectroscopic data is graphically demonstrated in Fig. 2, where we plot a moving average density of galaxies in the COSMOS catalogue. We count galaxies within 50 Mpc of the radial distance for each object in the catalogue. The $z_p$ redshifts are used in the calculations. The number of neighbours is then used to assess the local spatial concentration of galaxies as a function of distance. Apparent quasi-periodic oscillations around the average galaxy density in the distance range ~2000 through ~7000 Mpc with the characteristic length of ~330 Mpc demonstrate the large-scale inhomogeneities in the $z_p$ measurements.
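The neighbour counting behind the moving average can be sketched as follows. This is a minimal illustration that works only on the radial comoving distances (here assumed to be already derived from $z_p$); converting the counts to a space density would additionally require the shell volume within the survey solid angle, which is not attempted here. The function name is hypothetical.

```python
import bisect


def neighbour_counts(distances, half_width=50.0):
    """For each galaxy, count the other galaxies with |R_j - R_i| <= half_width [Mpc]."""
    ordered = sorted(distances)
    counts = []
    for r in distances:
        lo = bisect.bisect_left(ordered, r - half_width)
        hi = bisect.bisect_right(ordered, r + half_width)
        counts.append(hi - lo - 1)  # subtract 1 to exclude the galaxy itself
    return counts
```

Sorting once and using binary searches keeps the cost at O(N log N), which matters for a catalogue of about 250000 objects.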

Both the $z_p$ inaccuracies and the sky coverage discontinuities introduce multiple biases in the COSMOS galaxy data that distort the actual spatial structures. Although it is feasible to construct a mask that would eliminate (all ?) the area not covered by the COSMOS catalogue, the remaining surface nonuniformities potentially present in the data would persist. In the next section we present a practical method to isolate the real clustering signal from all the non-cosmic effects.

The sample spans a wide range of absolute magnitudes. In Fig. 3 the absolute $V_J$ magnitudes (Ilbert et al. 2009) are plotted against the distance. Apart from the clear effects introduced by the $z_p$ distortions, an inclination of the high-luminosity envelope defines a rate of the average galaxy luminosity evolution. Black dots in Fig. 3 show the magnitudes of the 10th ranked galaxy in 500 Mpc bins between 500 and 7000 Mpc. Although in the course of the cosmic evolution galaxy luminosities are subject to more complex variations, we adopt here this relationship as the luminosity evolution of the general galaxy population. We divide the data into absolute magnitude classes using lines of fixed slope roughly consistent with the slope of the $V_J(10)$-distance relationship (see Fig. 3). The dashed lines in the figure define four luminosity sub-samples. The samples A-C contain ~70000 galaxies each, while the sample D contains around half of this amount. Because the D sample is limited to redshifts smaller than 0.5, the present analysis of the clustering evolution concentrates on samples A-C. Although the rate of the luminosity evolution adopted here is linear in the comoving distance, it is in good agreement with the evolution model (linear in redshift) adopted by Marulli et al. (2013) in the redshift range of 0.5-1.1.

Moving to larger distances, all the galaxy sub-samples become increasingly incomplete (Fig. 2 for the whole data, and Fig. 10 for the class A). This effect should not bias the measurements of the CF if the magnitude selection at a given distance is uncorrelated with the local galaxy space density, and this is assumed in the subsequent calculations.


3 2D CORRELATION FUNCTIONS 


We take the photometric redshifts, $z_p$, as a working estimate of the comoving distances for all the galaxies. This is a legitimate assumption for the majority of galaxies. According to Ilbert et al. (2009), only a (small) fraction of $z_p$ differs from $z_s$ by more than $0.15 \cdot (1 + z_s)$. Nevertheless, in the present derivation we explicitly take into account the questionable nature of the individual distance estimates. This is because even a modest fraction of catastrophic errors affects the present analysis (see below).

Clearly, the distance estimates based on $z_p$ are too coarse to measure the spatial clustering directly. Nevertheless, the $z_p$ data are valuable in measurements of the galaxy clustering at different distances using the 2D CF. Let us consider the expected distribution of galaxies around a galaxy selected at a distance $d = d(z_p)$. The expected number of galaxies within a solid angle $\Delta\omega$ at the angular distance $\theta$ from the selected galaxy, $N(\theta)$, is described by the 2D CF $w(\theta \mid d)$:

$$ N(\theta) = \Delta\omega \left[ n_0 \cdot w(\theta \mid d) + n_0 \right], \qquad (1) $$

where $n_0$ is the galaxy surface density expected in the absence of cosmic clustering. For perfect data $n_0$ is equal to the mean galaxy surface density. However, the present observational material reveals numerous defects that interfere with the sky fluctuations. In effect, the local 'background' galaxy density, $n_0$, is not constant but depends on the position of the selected galaxy and the separation $\theta$.

For data spanning a large distance range, as in the case of the COSMOS catalogue, equation (1) is of limited use because the excess of neighbours, $\delta N(\theta) = N(\theta) - \Delta\omega \cdot n_0$, even at small separations $\theta$, is tiny as compared to $\Delta\omega \cdot n_0$. To improve the S/N ratio, which is of the order of $\delta N(\theta)/\sqrt{N(\theta)}$, we divide the whole galaxy sample into six distance bins between 650 and 6550 Mpc. The radial depth of each bin is larger than 750 Mpc. Thus, it is also much larger than the maximum distance at which the CF differs from 0.

Calculating the surface correlations of galaxies within a selected bin, we split the whole galaxy population into two classes. Class I contains galaxies located in the bin of the selected galaxy, while class II contains all the galaxies in other bins. The expected average excess of galaxies, $N_I(\theta)$, or the surface density profile, $n_I(\theta)$, of class I galaxies in the vicinity of a galaxy drawn from the same distance bin is described by the CF $w_i(\theta)$:

$$ \frac{N_I(\theta)}{\Delta\omega} = n_I(\theta) = n_{I0} \cdot w_i(\theta) + n_{I0}, \qquad (2) $$

where $n_{I0}$ is the class I galaxy surface density expected for the non-clustered case. It is subject to various observational constraints and it may vary like $n_0$. We now assume a power law for the $w_i(\theta)$ function separately for each bin:

$$ w_i(\theta) = W_i\, \theta^{\zeta_i}, \qquad i = 1, \dots, 6. \qquad (3) $$

In the absence of all the effects involved in the data acquisition that distort the genuine sky distribution of galaxies, one could use equations (2) and (3) directly to determine the amplitude, $W_i$, and slope, $\zeta_i$. However, in the real data the 'average' galaxy density, $n_{I0}$, is not well defined. One way to eliminate the effects of the $n_{I0}$ fluctuations is to use the surface distribution of class II galaxies, $n_{II0}$. We assume that intruding (non-cosmic) factors that affect the surface distribution of class I galaxies also modify the distribution of class II galaxies. Although the response of both galaxy populations to various effects may not be identical, we assume that the ratio $\eta_i = n_{I0}/n_{II0}$ is much more immune to observational biases than $n_{I0}$ and $n_{II0}$ separately. Thus, dividing equation (2) by $n_{II0}$, we get the numerically tractable equations:

$$ \frac{N_I(\theta)}{N_{II}(\theta)} = \eta_i\, W_i\, \theta^{\zeta_i} + \eta_i, \qquad (4) $$

where $N_{II}(\theta) = \Delta\omega \cdot n_{II0}$ is the average number of class II galaxies at the distance $\theta$ from a randomly chosen galaxy of class I. As an estimator of $N_I(\theta)$ and $N_{II}(\theta)$ for the given distance bin we use the total numbers of the class I galaxy pairs and the class I-class II pairs, respectively, both summed over the entire field.
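The estimator described above can be sketched with a brute-force pair count. The example below is only an illustration under simplifying assumptions: positions are treated as flat-sky Cartesian coordinates (adequate for a field of about 1.4 deg on a side), every galaxy carries an integer distance-bin label, and the O(N²) double loop would have to be replaced by a tree or grid search for the full catalogue. All names are hypothetical.

```python
import math
from bisect import bisect_right


def pair_count_ratio(positions, bin_ids, target_bin, edges):
    """Pair counts around class I galaxies (those in `target_bin`) per separation zone.

    positions : list of (x, y) flat-sky coordinates
    bin_ids   : distance-bin label of each galaxy
    edges     : ascending separation-zone edges, same units as positions
    Returns (ratio, n_same, n_other) with ratio = N_I / N_II in each zone.
    """
    n_zones = len(edges) - 1
    n_same = [0] * n_zones    # class I - class I pairs (each pair counted from both ends)
    n_other = [0] * n_zones   # class I - class II pairs
    for a, (xa, ya) in enumerate(positions):
        if bin_ids[a] != target_bin:
            continue
        for b, (xb, yb) in enumerate(positions):
            if a == b:
                continue
            zone = bisect_right(edges, math.hypot(xa - xb, ya - yb)) - 1
            if 0 <= zone < n_zones:
                if bin_ids[b] == target_bin:
                    n_same[zone] += 1
                else:
                    n_other[zone] += 1
    ratio = [s / o if o else float("nan") for s, o in zip(n_same, n_other)]
    return ratio, n_same, n_other
```

Since both counts are accumulated around the same class I galaxies, masked regions and depth variations enter the numerator and the denominator in a similar way, which is the motivation for working with the ratio.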

To assess the power-law parameters of the angular CF, the present method applies the pair count ratio $n_{I0}/n_{II0}$ rather than just the pair counts $n_{I0}$. This scheme deals simultaneously with two questions. First, it accounts in a simple way for all the masked out areas. Second, if there are some unrecognized systematic effects that perturb the surface galaxy distribution, they are absorbed by the $\eta_i$ parameter while the pure clustering signal is represented by the power law.
Figure 4. The surface correlation functions for the luminous galaxies (sample A) in the six distance bins (labelled in the bottom left corner); the distance/redshift boundaries are indicated in the top right corner. The data points, 1σ error bars, and the power-law least-squares solutions of equation (4) are shown.

Figure 5. The 2D correlation function slopes: full points - luminosity class A, open circles - class B, triangles - class C.


The parameters $W_i$, $\zeta_i$ and $\eta_i$ for each distance bin $i$ are obtained as the iterative least-squares solution of equation (4). The first two parameters, $W_i$ and $\zeta_i$, define the 2D CFs. The results for the luminosity sample A are shown in Fig. 4. To ease the comparisons of the correlation parameters in the different bins, and to indicate what linear distances are involved in the calculations of the space CF (see below), we plot the data as a function of the transverse comoving distance, $p$, rather than the angular separation, with $p = \theta \cdot R_i$, where $R_i$ denotes the distance to the centre of the $i$th bin. In all the distance bins, except for the nearest one, the galaxy pairs are counted in 19 separation zones in the range of $0.1 < p < 22.6$ Mpc. In the nearest distance bin (650-1400 Mpc) the range of separations is reduced to $0.1 < p < 10.0$ Mpc due to the small angular size of the COSMOS field.
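One simple way to solve equation (4) numerically is sketched below. It is not the iterative scheme used in the paper: instead it exploits the fact that, for a fixed slope $\zeta$, the model $a\,\theta^{\zeta} + b$ is linear in $a = \eta W$ and $b = \eta$, so a weighted linear fit can be repeated over a grid of trial slopes and the lowest $\chi^2$ retained. The function name, the grid and the error model are assumptions made for illustration.

```python
def fit_power_law_ratio(theta, ratio, sigma, zeta_grid):
    """Fit ratio = eta * (W * theta**zeta + 1) by scanning zeta over a grid.

    Returns (W, zeta, eta, chi2) for the grid point with the lowest chi-square.
    """
    best = None
    for zeta in zeta_grid:
        # Weighted normal equations for the linear model a * x + b, with x = theta**zeta.
        s_w = s_x = s_xx = s_y = s_xy = 0.0
        for t, r, s in zip(theta, ratio, sigma):
            w = 1.0 / (s * s)
            x = t ** zeta
            s_w += w
            s_x += w * x
            s_xx += w * x * x
            s_y += w * r
            s_xy += w * x * r
        det = s_w * s_xx - s_x * s_x
        if det == 0.0:
            continue  # degenerate design (e.g. zeta = 0), skip this trial slope
        a = (s_w * s_xy - s_x * s_y) / det   # a = eta * W
        b = (s_xx * s_y - s_x * s_xy) / det  # b = eta
        chi2 = sum(((r - a * t ** zeta - b) / s) ** 2
                   for t, r, s in zip(theta, ratio, sigma))
        if best is None or chi2 < best[0]:
            best = (chi2, a / b, zeta, b)
    chi2, amplitude, zeta, eta = best
    return amplitude, zeta, eta, chi2
```

The grid resolution sets the precision of the recovered slope; a finer grid or a one-dimensional minimizer around the best grid point would refine it.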


The parameters $W_i$ depend on the actual 3D clustering amplitude and on the surface density of galaxies in the consecutive distance bins. To assess the spatial clustering one requires information on the spatial density of galaxies in each bin, and this question is discussed below. The CF slopes, $\zeta_i$, are not greatly affected by the bin limits. In Fig. 5 variations of the correlation slopes as a function of distance are shown for galaxies in three luminosity samples. The error bars indicate 68 % confidence levels assuming one interesting parameter (Avni 1976). Large $\zeta_i$ uncertainties in the samples B and C make any definite account of the slope variations problematic. Nevertheless, two conclusions seem to be relatively well established. First, no obvious trend of slope changes with the distance is present in any of the galaxy luminosity samples. Second, the slope for the most luminous galaxies (sample A) is generally steeper than that for the remaining ones. The detailed interpretation of the present results on the CF slope is discussed jointly with the amplitude of the spatial CF in a later section.


4 3D CORRELATION FUNCTIONS 

We use the photometric redshifts solely to define wide distance bins, and not to examine individual galaxy pairs in 3D. The excess number of close pairs relative to that expected for the random distribution is clearly visible in each bin. The fine 2D correlation signal shows that the statistical characteristics of photometric redshifts are sufficient to determine the evolution of the CF slope over a wide range of cosmological epochs. To assess the amplitude of the space correlations at successive distances, one should deproject the corresponding surface CFs.

The space CF gives the number of excess galaxy pairs relative to the local average galaxy density. In the present investigation the galaxy density varies between bins and also within each individual bin. However, these variations do not constrain the CF estimates. The galaxy excess given by the 2D CF is equal to the integral of the 3D CF properly weighted by the space density of galaxies populating the selected distance bin. The deprojection procedure is described in detail below.

A power law not only provides a satisfactory approximation to the 2D CF, but also allows for a straightforward assessment of the spatial correlation parameters. Under the standard assumption of clustering isotropy and the small-angle approximation, the spatial CF is a power law with a slope $\gamma = \zeta - 1$. To retrieve the normalization of the spatial correlations from the 2D function, one needs information on the radial distribution of the average galaxy density in the sample. In the present analysis, the galaxy concentration varies systematically with the radial distance. We now derive the formulae relating the 3D CF to the 2D function and the varying galaxy density.

The 3D CF measures the local average excess of galaxies relative to the average density of galaxies. Let $\Delta N_V(r)$ be the average excess of galaxies, i.e. the number of galaxies above the average within a volume $\Delta V$ at a distance $r$ from a randomly chosen galaxy. The CF $\xi(r)$ is proportional to the number of excess objects:

$$ \Delta N_V(r) = \Delta V\, \rho\, \xi(r), \qquad (5) $$







































6 A. M. Soltan and M. J. Chodorowski 


where $\rho$ denotes the average galaxy density. In the present data the average galaxy density is a strongly varying function of the distance $R$ within each bin, $\rho = \rho(R)$. Using the power-law model for $\xi$:

$$ \xi(r) = \left( \frac{r}{r_0} \right)^{\gamma}, \qquad (6) $$

where $r_0$ is the correlation length, one can integrate the galaxy excess equation (5) along the line of sight at fixed transverse separation $p$:

$$ \Delta N_A(p) = \Delta A\; G(\gamma)\; r_0^{-\gamma}\; \rho(R)\; p^{\zeta}, \qquad (7) $$

where $\Delta A$ is the surface area of the volume $\Delta V$ projected in the sky plane, and $G(\gamma) = \Gamma\left(\tfrac{1}{2}\right)\, \Gamma\left(-\tfrac{\gamma+1}{2}\right) / \Gamma\left(-\tfrac{\gamma}{2}\right)$ (Totsuji & Kihara 1969). In the actual calculations the maximum radial separation of galaxy pairs is limited by the bin boundaries, which reduces the $G(\gamma)$ factor. The amplitude of this effect depends on the radial distribution of the galaxy density (see Sect. 4.3). In the present case, the radial depth of the distance bin is non-negligible in comparison to the distance $R$ itself. The counts of galaxy pairs are in fact performed within the fixed angular separation $\theta$ rather than $p$. Substituting $p = \theta R$ and $\Delta A = \Delta\omega \cdot R^2$ into equation (7) we get:

$$ \Delta N_\omega(\theta) = \Delta\omega\; G(\gamma)\; r_0^{-\gamma}\; \rho(R)\; \theta^{\zeta}\; R^{\gamma+3}, \qquad (8) $$


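where the subscript $\omega$ indicates that counts are collected within the solid angle $\Delta\omega$. Notice that here $p$ and $\theta$ are related to the varying distance $R$ rather than to the bin centre $R_i$. The excess $\Delta N_\omega(\theta)$ averaged over all the galaxies found in the selected distance bin is given by:

$$ \Delta N_{\omega i}(\theta) = \Delta\omega\; G(\gamma_i)\; r_0^{-\gamma_i}\; \theta^{\zeta_i}\; \frac{\int \mathrm{d}R\, R^{\gamma_i+5}\, \rho_i^2(R)}{\int \mathrm{d}R\, R^2\, \rho_i(R)}. \qquad (9) $$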


The index $i$ specifies that only galaxies assigned to the $i$th distance bin are used to count the galaxy pairs, and $\rho_i(R)$ is the average density of these galaxies at the distance $R$. The galaxy excess at the left-hand side of equation (9) is equal to that in equations (2) and (3) using the 2D correlations:

$$ \Delta N_{\omega i}(\theta) = \Delta\omega\; n_{i0}\; w_i(\theta). \qquad (10) $$

Here, $n_{i0}$ denotes the observed average surface density of galaxies of the $i$th distance bin. The average radial density distribution, $\rho_i(R)$, in equation (9) is the actual distribution that might not be adequately described by the photometric redshift distribution. The question of statistical reconstruction of the spectroscopic redshift distribution based on the photometric redshifts is discussed in the next section.
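Equating the right-hand sides of equations (9) and (10) and inserting the power law (3) gives $r_0^{-\gamma_i} = n_{i0} W_i \int \mathrm{d}R\, R^2 \rho_i / [G(\gamma_i) \int \mathrm{d}R\, R^{\gamma_i+5} \rho_i^2]$. The sketch below evaluates this expression for a tabulated density profile. It is an illustration only: it uses the full $G(\gamma)$ factor and therefore ignores the reduction caused by the finite bin depth mentioned above, it assumes $W_i$ refers to $\theta$ in radians and $n_{i0}$ to galaxies per steradian, and all function names are hypothetical.

```python
import math


def g_factor(gamma):
    """G(gamma) for xi(r) = (r / r0)**gamma; requires gamma < -1 for convergence."""
    return (math.gamma(0.5) * math.gamma(-(gamma + 1.0) / 2.0)
            / math.gamma(-gamma / 2.0))


def trapezoid(xs, ys):
    """Trapezoidal integral of tabulated ys over xs."""
    return sum(0.5 * (ys[k] + ys[k + 1]) * (xs[k + 1] - xs[k])
               for k in range(len(xs) - 1))


def correlation_length(w_amp, zeta, n_surface, radii, density):
    """Correlation length r0 [Mpc] from the 2D amplitude and slope of one distance bin.

    w_amp     : 2D amplitude W_i (theta in radians, assumed)
    zeta      : 2D slope, so that gamma = zeta - 1
    n_surface : surface density n_i0 [galaxies per steradian, assumed]
    radii     : tabulated comoving distances R [Mpc]
    density   : space density rho_i(R) [Mpc^-3] at those distances
    """
    gamma = zeta - 1.0
    numer = trapezoid(radii, [r ** (gamma + 5.0) * d * d
                              for r, d in zip(radii, density)])
    denom = trapezoid(radii, [r * r * d for r, d in zip(radii, density)])
    r0_pow = n_surface * w_amp * denom / (g_factor(gamma) * numer)  # equals r0**(-gamma)
    return r0_pow ** (-1.0 / gamma)
```

For $\gamma = -1.8$ the factor `g_factor(-1.8)` is about 3.68, the familiar value for a power law of this slope.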

We use the term ‘spectroscopic redshift’ to denote the perfect distance measure based on the Hubble expansion. The actual spectroscopic redshifts are neither sufficiently accurate, nor do they adequately represent the Hubble flow. In the present consideration, the statistical relationships between the redshift distributions are used to assess the average distribution of galaxy spatial density as a function of cosmological distance. Thus, small deviations between the ‘Hubble flow redshifts’ and their spectroscopic counterparts are of no importance. One should also note that the photometric redshifts are used in the present calculations exclusively to define distance bins and not to compute galaxy separations in the radial direction. Thus, the construction of the 3D CF is not affected by the ‘redshift distortions’ that modify the separations between objects.

Figure 6. The distribution of $z_s - z_p$ differences, plotted as a function of $y = (z_s - z_p)/(1 + z_p)$, in three magnitude bands indicated in the upper right corners. Black line histogram - the observed distribution, the shaded area - the ML fit. The ordinate axis shows the number of galaxies with known spectroscopic redshifts.

The question of the radial distribution of the average galaxy density, $\rho(R)$, deserves more comment. The transverse extension of the COSMOS field at distances 4000-6000 Mpc amounts to 100-150 Mpc (in comoving units), and even at large distances the survey area is smaller than typical large-scale structures (LSSs). However, both the transverse survey size and the radial extension of the distance bins are distinctly larger than the galaxy pair separations used in the CF estimates (nearly all the correlation signal is limited to projected separations below 10 Mpc). Since the average density $\rho(R)$ and the CF are determined using the same observational material, the small-scale fluctuations defined by the CF are measured ‘on top’ of the LSS potentially present in the data. From that point of view, it would be desirable to compare our results with the deep CF assessments in other fields.


4.1 Modelling the photometric redshift inaccuracies and failures

Figure 7. Same as Fig. 6 for magnitude bands 22-23 and 23-25.

To determine the statistical distribution of spectroscopic redshifts, one needs to construct the probability distributions of $z_p - z_s$ differences. Let $p(z_s \mid z_p)$ denote the probability that a galaxy with the assigned photometric redshift $z_p$ has the spectroscopic redshift $z_s$. The expected distribution of $z_s$ is then a convolution of the photometric redshifts with the $p(z_s \mid z_p)$ probability:

$$ n(z_s) = \int \mathrm{d}z_p\; p(z_s \mid z_p)\; n(z_p), \qquad (11) $$

where $n(z_p)$ is the observed distribution of the photometric redshifts. Since the distribution $p(z_s \mid z_p)$ depends strongly on the galaxy magnitude, it is modelled separately in the consecutive magnitude bands. Thorough discussions of the $z_s - z_p$ deviations (e.g. Ilbert et al. 2009; Dahlen et al. 2013) show that $z_p$ errors are efficiently expressed using the parameter $x = (z_p - z_s)/(1 + z_s)$. It was found that a single Gaussian function adequately reproduces the probability distribution $p(x)$ for small absolute values of $x$, i.e. for the ‘successful’ $z_p$ estimates. However, $p(x)$ has broad wings, inconsistent with the narrow central Gaussian peak, and for still larger $x$ the probability distribution has a small but quasi-constant amplitude.

The objective of our calculations is to assess the $z_s$
distribution using the $z_p$ data. It is therefore natural to use
in equation (11) a somewhat different parametrization of the
$z_s - z_p$ differences, viz. $y = (z_s - z_p)/(1 + z_p)$. Since even
at the fainter magnitudes the photometric redshifts provide a
statistically robust estimate of galaxy distances, the probability
distributions $p(y)$ and $p(x)$ have a similar structure.
Consequently, the function $p(y)$ also exhibits a narrow central
peak, relatively wider wings and a weak constant signal. To match
these characteristics, we fit an analytic function that is a
sum of three components: (a) a Gaussian, emulating the narrow
peak near $y \approx 0$; (b) a resonance curve, reproducing the
contribution of larger $z_p$ deviations; and (c) a flat signal,
reproducing the catastrophic errors. In total, six parameters of
the $p(y)$ distribution were determined using the maximum
likelihood (ML) estimation method. The detailed description of the
fitted function and the whole procedure, as well as the numerical
results, are presented in Appendix A.
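
The convolution in equation (11) can be illustrated numerically. The sketch below is not the code used for the paper; it assumes that the resonance curve is a normalized Lorentzian, that the flat component is spread uniformly over a hypothetical interval [y_min, y_max], and that the Table A1 columns for the 22-23 mag band read $a_g$, $\mu_g$, $s_g$, $a_r$, $\mu_r$, $s_r$. The top-hat $n(z_p)$ and both grids are purely illustrative.

```python
import numpy as np


def p_y(y, a_g, mu_g, s_g, a_r, mu_r, s_r, y_min=-1.0, y_max=3.0):
    """Three-component density of y = (z_s - z_p) / (1 + z_p).

    Gaussian core + resonance (Lorentzian) wings + flat floor. The support
    [y_min, y_max] of the flat part is an assumption of this sketch.
    """
    y = np.asarray(y, dtype=float)
    gauss = np.exp(-0.5 * ((y - mu_g) / s_g) ** 2) / (np.sqrt(2.0 * np.pi) * s_g)
    reson = s_r / (np.pi * ((y - mu_r) ** 2 + s_r ** 2))
    flat = np.where((y >= y_min) & (y <= y_max), 1.0 / (y_max - y_min), 0.0)
    a_0 = 1.0 - a_g - a_r  # the three weights are assumed to sum to one
    return a_g * gauss + a_r * reson + a_0 * flat


def n_zs(zs_grid, zp_grid, n_zp, params):
    """Equation (11) evaluated with the rectangle rule on a uniform z_p grid."""
    dzp = zp_grid[1] - zp_grid[0]
    one_plus_zp = 1.0 + zp_grid[None, :]
    y = (zs_grid[:, None] - zp_grid[None, :]) / one_plus_zp
    # p(z_s | z_p) dz_s = p(y) dy, with dy = dz_s / (1 + z_p)
    kernel = p_y(y, *params) / one_plus_zp
    return (kernel @ n_zp) * dzp


# Illustrative run: a top-hat photometric bin and the 22-23 mag parameters.
params_22_23 = (0.3300, -0.0008, 0.0044, 0.6461, 0.0039, 0.0094)
zp = np.linspace(0.8, 1.0, 201)
zs = np.linspace(0.0, 4.0, 4001)
n_p = np.ones_like(zp)
n_s = n_zs(zs, zp, n_p, params_22_23)
```

The reconstructed n_s keeps a sharp core inside the photometric bin, while the resonance wings and the flat floor move a small fraction of the objects to distant redshifts.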

The spectroscopic redshifts come mostly from the zCOSMOS Data
Release (Lilly et al. 2007). We also searched the NED Database,
and several redshifts have been extracted from Onodera et al.
(2012), Bezanson et al. (2013) and van de Sande et al. (2013). The
fits are shown in Figs. 6 and 7, where the ordinate axis gives the
number of objects rather than the probability. The galaxy sample
was divided into five magnitude bins. In agreement with the
Ilbert et al. (2009) discussion, the fits for galaxies brighter
than 23 mag are strongly peaked at $y = 0$, while the fainter
objects exhibit substantially larger scatter. Nevertheless,
despite a non-negligible number of ‘catastrophic’ $z_p - z_s$
discrepancies below 23 mag, the central concentration is still
dominant. This allows us to use equation (11) effectively.
However, the relatively small number of spectroscopic redshifts
in the 23-25 mag bin generates easily visible noise and broadens
the uncertainty limits of the fitted parameters. This question is
discussed in detail below.


4.2 Correlation length 

The spatial density of galaxies assigned to the $i$th distance
class, $\rho_i(R)$, is related to the number of galaxies
$n_i(z_s)$ in a standard way:

$$ \rho_i(R) = \frac{n_i(z_s)}{\Omega R^2} \frac{dz_s}{dR}, \qquad (12) $$

where $\Omega$ is the solid angle of the survey. Together with the
preceding relations, equation (12) allows us to determine the
spatial density correlation parameter $r_0$ for each distance bin.
The galaxies in a given bin are spread over a range of magnitudes.
Therefore, the probability distribution $p(z_s \mid z_p)$ in
equation (11) is weighted according to the magnitude distribution
of the $n_i(z_p)$ galaxies. Fig. 8 shows our best estimates of the
CF parameters for class A galaxies. The error bars here are
generated solely by uncertainties in the fitting of the power law
to the 2D CFs (Fig. 4). The errors represent 68 per cent
uncertainties assuming one interesting parameter (Avni 1976).

The assumption that photometric redshifts contain sufficient
information to derive the amplitude of the space CF has been
tested, with positive results, in the present investigation.
Using extensive simulations, we generated a synthetic galaxy
distribution with a known space CF and then successfully recovered
the CF parameters according to the prescription presented above.
The computational details are described in Appendix B.

The simultaneous fitting of $\gamma$ and $r_0$ induces a strong
correlation of both parameters². In Fig. 9 the $\gamma$-$r_0$
confidence regions are shown in all the distance bins for galaxies
of class A; the data for classes B and C are limited to the five
and four nearest bins, respectively. The contours do not
incorporate uncertainties introduced by the estimates of the
distribution

2 To be precise, we fit simultaneously three parameters ry, vy, 
and G of equation 0 Two of them are ‘interesting’, viz. Wi and 
Q. Observational correlation between w>i and Q is transferred into 
correlation between r 0 and 7 via equations |9| and 
Figure 8. Power-law parameters: $\gamma$ (slope) and $r_0$
(normalization) of the spatial correlation function of class A
galaxies versus look-back time. Full dots show simultaneous fits
of $\gamma$ and $r_0$; crosses in the lower panel are the $r_0$
best estimates assuming the slope fixed at the average value
$\gamma = -1.92$.


of the $z_s - z_p$ differences, or the ML fits of $p(y)$. To assess
the $p(y)$ uncertainties, mock $p(y)$ distributions were created
using the bootstrap method. The details of the whole procedure are
described in Appendix A.

We limited this uncertainty analysis to the faintest galaxies
because at the brighter magnitudes the $p(y)$ distributions are
based on extensive data and the uncertainty of the $p(y)$ fits is
negligible. The small number of spectroscopic redshifts at faint
magnitudes results in the poor quality of the $p(y)$ fit in the
$23 < i^+ < 25$ band. Galaxies in the 23-25 mag band dominate in
the distance bins 4-6: they constitute 71, 97 and 99.5 per cent of
the class A galaxies in the distance bins 4, 5 and 6, respectively.

The calculations of $r_0$ for the simulated $p(y)$ probability
functions were repeated in the same way as for the actual data. We
find that the rms scatter of $r_0$ produced by the uncertainties of
the ML fitting amounts to 0.16, 0.26 and 0.29 Mpc for bins 4, 5 and
6, respectively. Thus, the errors related to the ML fits are
several times smaller than the uncertainties defined by the
confidence level contours. Assuming that both errors add in
quadrature, the uncertainties involved in the ML fitting do not
contribute significantly to the total uncertainties.
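
The size of this effect follows from simple arithmetic. The snippet below is only a sketch: the ratio of 3 between the two error sources is a hypothetical stand-in for ‘several times smaller’, since the widths of the confidence contours are not quoted numerically in the text.

```python
import math


def total_error(sigma_stat, sigma_ml):
    """Combine two independent error sources in quadrature."""
    return math.hypot(sigma_stat, sigma_ml)


def relative_inflation(ratio):
    """Fractional growth of the total error when sigma_ml = sigma_stat / ratio."""
    return math.sqrt(1.0 + 1.0 / ratio ** 2) - 1.0


# Hypothetical case: the ML-fit error is three times smaller than the
# statistical error read off the confidence contours.
inflation = relative_inflation(3.0)
print(f"total error grows by {100.0 * inflation:.1f} per cent")
```

For a ratio of 3 the total error grows by about 5 per cent, and the growth shrinks quickly as the ratio increases.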


4.3 The radial extension of distance bins 


In Fig. 10 we plot the radial distribution of the space density of
galaxies of luminosity class A calculated using equations (11) and
(12). Although the effects of the photometric redshift errors are
easily visible, the bins are still well defined in real space. The
long tails of the three most distant bins, stretching towards
lower distances, result from the catastrophic $z_p$ errors.

It was shown in the previous section that the uncertainties of the
$p(y)$ fits contribute marginally to the total errors of our $r_0$
estimates. This is because the $p(y)$ scatter only weakly affects
the $n(z_p) \rightarrow n(z_s)$ transformation. The thin curves for
the three most distant bins in Fig. 10 show the rms uncertainty
range of our $n(z_s)$ reconstruction produced by the stochastic
character of the $p(y)$ estimates. Evidently, statistical
uncertainties strongly affect only the low-amplitude tails of the
individual $n(z_s)$ distributions.

Figure 9. Power-law parameters of the correlation functions in
the distance bins 1-6 (labelled in the bottom left corner);
distance/redshift boundaries are indicated in the top right corner.
Full dots: class A, open circles: class B, triangles: class C.
Contours show regions of 68 and 90 per cent confidence level.

The composite shape of the $\rho_i(R)$ distribution is accounted
for in the calculations of the correlation length $r_0$. One should
notice that the contribution to the $r_0$ amplitude of the
low-density extensions (which suffer from large uncertainties) is
small compared to that of the ‘central’ section of the bin. This is
because the integral at the right-hand side
of equation 0 contains the density square. 

Relationships between the measured 2D CF and the 3D CF derived
using equations (9) and (10) should be corrected for the finite
radial extensions of the bins. However, because the bin depths are
substantially larger than the expected maximum separations at which
the CF significantly deviates from zero, the effect is small
compared to the statistical uncertainties. It was assessed as
follows.

Counting galaxy pairs in the chosen distance bin corre¬ 
sponds to the double integration along the line of sight of the 
3D CF weighted by the actual radial galaxy space density 
distribution. The difference between the results based on the 
equation |t]) and the pair number counts depends on the a 
priori unknown parameters of $\xi(r)$. Since this difference is
expected to be small in the present calculation, we expressed
the amplitude of the finite bin depth effect a posteriori, taking
the best-fitting $\xi(r)$ parameters as determined in the
previous subsection. In all the bins, the relative difference
between the analytic value used in equation 0 end the ‘ef- 

Figure 10. Thick curves: numbers of class A galaxies in six
distance bins (indicated by labels 1-6) corrected for the $z_p$
errors using equation (11). Thin curves: $1\sigma$ uncertainty
regions resulting from uncertainties of the $z_s - z_p$ fits (see
Appendix A for details), shown only for the three most distant
bins, where the errors are more pronounced. The upper envelope
shows the summed galaxy numbers. The distance bins partially
overlap because galaxies are assigned to their bins based on the
photometric redshift.


fective’ G-factor is negligible at the small separations, rises 
to ~1 per cent at transverse separations of 1-3 Mpc, and reaches
5-8 per cent (depending on the bin number) at ~20 Mpc. The expected
impact of this ‘integral constraint’ on the CF slope $\gamma$ would
be of the order of ~0.02, with a correspondingly small effect on
the correlation length.


5 DISCUSSION AND CONCLUSIONS 


The mean absolute magnitude of galaxies in the sample A 
coincides roughly with the characteristic magnitude M* in 
the Schechter luminosity function. Because of the flux se¬ 
lection, the mean magnitude difference between the samples 
A-C varies with the distance. In the first three distance bins, 
galaxies in the sample B are fainter than class A galaxies by 
~ 1.4 mag, while in the distance bins 4 and 5 this difference 
drops to 1.3 and 1.2, respectively. The magnitude separation 
between sample A and C is more strongly affected by the 
selection. It amounts to 2.9 mag in the first two distance 
bins, and is reduced to 2.7 and 2.3 in the next two bins, 
respectively. 

A comparison of our CF slope estimates with a number of
fragmentary measurements present in the literature leads to
somewhat ambiguous conclusions. Marulli et al. (2013) compile a
number of recent results on the CF parameters derived for a wide
range of galaxy luminosities and redshifts. Our results fit well
into the general distribution of measurements in their fig. 5. In
particular, the slope flattening in samples B and C as compared to
sample A seems to be present also in the published results (e.g.
Pollo et al. 2006; Coil et al. 2006). One should note, however,
that individual measurements frequently refer to different
redshifts and are subject to large uncertainties. Therefore,
questions on the specific relationships between CF parameters still
remain open. The VIPERS data (Marulli et al. 2013), which suffer
from the smallest uncertainties, indicate a weak but systematic
flattening of the slope with redshift, while this effect is not
indicated by our investigation. Furthermore, these data show only a
marginal dependence of the CF slope on absolute magnitude. One
should note, however, that the VIPERS galaxies in the Marulli et
al. (2013) data span a relatively narrow magnitude range of
$\Delta M \approx 1.5$, and are confined to redshifts between 0.5
and 1.1.


Despite the considerable size of the banana-shaped confidence
regions in Fig. 9, our estimates of $\gamma$ and $r_0$ provide
constraining information on the cosmic evolution of the galaxy CF.
Although the relative positions of the best-fitting parameters of
class A-C galaxies in the $r_0$-$\gamma$ plane vary from one
distance bin to another, the shape and orientation of the
confidence regions demonstrate some permanent characteristics of
the CF over a wide redshift range. It appears that the correlation
length, $r_0$, of class A-C galaxies is quite similar in most of
the distance bins, although not necessarily identical. The larger
discrepancies between classes B and A in distance bins 2 and 5 are
of opposite sign, and seem to be generated by statistical
fluctuations in the data rather than by a cosmic signal. The
conclusion of the previous section that the CF slope is steeper for
the most luminous galaxies (class A) is strengthened by the overall
layout of the confidence regions.


Marulli et al. (2013) reach the opposite conclusion. They find ‘a
monotonic increase in the clustering length $r_0$’ as a function
of magnitude ‘in all three redshift ranges considered’, i.e.
0.5-0.7, 0.7-0.9 and 0.9-1.1. Taking into account the elongated
shapes of the confidence regions, such an interpretation of Fig. 9
is not completely ruled out; however, it is not favoured by the
present results.


It is instructive to compare our results with the Arnalte-Mur et
al. (2014) investigation, as it also uses photometric redshifts and
covers a relatively wide redshift range, $0.35 < z < 1.25$. Their
fig. 7 apparently indicates a rise of the correlation length with
galaxy luminosity, but no obvious dependence of the correlation
slope on luminosity. Despite these differences, both investigations
support the conclusion that the clustering amplitude increases with
galaxy luminosity. Our Fig. 9 shows that the present calculations
are not immune to statistical fluctuations. The question of
systematic errors is more ambiguous. It is assumed in the present
method that all the known as well as unrecognized selection effects
that alter the galaxy distribution are accounted for using the
information incorporated in the catalogue itself. In the
alternative methods, mock catalogues are used as a reference
‘random’ distribution. Such catalogues are designed to model all
the instrumental/observational constraints that modify the genuine
galaxy clustering. Which of these two methods is more suitable for
dealing with various observational biases depends on the particular
properties of the observational data. The overall characteristics
of the COSMOS/Subaru catalogue, namely a very wide redshift
coverage of a relatively small surface area with numerous gaps,
favour the method developed in this paper.


Because of the apparently constant slope of the CF for the class A
galaxies, it is reasonable to fix the slope for all the bins at the
average value of $\gamma = -1.92$. The best estimates of $r_0$ in
this case are shown with crosses in the lower panel of Fig. 8.
There is a weak indication that $r_0$ slowly increases with cosmic
time. The dotted line in the lower panel shows the least-squares
linear fit to $r_0$ versus look-back time. The slope of this line
differs from 0 at the $1.14\sigma$ level.

The issue of whether the present results are representative of the
general galaxy clustering properties or refer just to the COSMOS
field cannot be resolved using this field alone. The question is
legitimate because some peculiarities of the field have been
reported in the literature (e.g. Meneux et al. 2009). We postpone
answering it until a similar analysis is performed on other deep
fields.


5.1 The bias 

The homogeneous computation scheme applied to galaxies spread over
a huge redshift range allowed a uniform assessment of the
clustering evolution as well as of the galaxy bias, $b(z)$:

$$ b(z) = \sigma_g(z)/\sigma_m(z), \qquad (13) $$



where $\sigma_g(z)$ denotes the galaxy number rms fluctuations and
$\sigma_m(z)$ the mass rms fluctuations within the same volume.
The CF is rigorously connected to the galaxy rms fluctuations. In
particular, the rms of the galaxy number in a sphere of radius $r$
is related to the power-law CF (Peebles & Groth 1976):

$$ \sigma_g(r \mid z) = \left[ C_\gamma \, \xi(r \mid z) \right]^{1/2}, \qquad (14) $$

where $C_\gamma = 72 \cdot 2^{\gamma} / [(3+\gamma)(4+\gamma)(6+\gamma)]$,
and the volume size as well as the redshift dependence of $\sigma$
and $\xi$ are explicitly indicated. The growth of the matter
fluctuations, $\sigma_m(r \mid z)$, in the linear regime is
completely defined by the cosmological model. In the following, we
use the set of equations listed by Meneux et al. (2008):
$\sigma_m(r \mid z) = \sigma_m(r \mid 0)\, D(z)$, where
$D(z) = g(z)/[g(0)(1+z)]$ and

$$ g(z) = \frac{5}{2}\, \Omega_m \left[ \Omega_m^{4/7} - \Omega_\Lambda + (1 + \Omega_m/2)(1 + \Omega_\Lambda/70) \right]^{-1} $$

is the normalized growth factor (Carroll et al. 1992). The evolving
density parameters $\Omega_m$ and $\Omega_\Lambda$ are related to
their present-epoch values:

$$ \Omega_m \equiv \Omega_m(z) = \frac{\Omega_{m0}\,(1+z)^3}{E^2(z)}, \qquad \Omega_\Lambda \equiv \Omega_\Lambda(z) = \frac{\Omega_{\Lambda 0}}{E^2(z)}, \qquad (15) $$

where $E^2(z) = \Omega_{\Lambda 0} + \Omega_{m0}\,(1+z)^3$ is the
expansion factor for the flat cosmological model, for which
$\Omega_{\Lambda 0} + \Omega_{m0} = 1$. The

present-epoch mass fluctuation amplitude, $\sigma_m(0)$, is
commonly normalized to the mass rms within a sphere of radius
$r = 8\,h^{-1}$ Mpc, where $h = H_0/100$ km s$^{-1}$ Mpc$^{-1}$. We
use the Komatsu et al. (2011) figure $\sigma_m(0) = 0.82$ based on
7 yr WMAP observations. The bias-redshift relations deduced from
the CF measurements are shown in Fig. 11. Dots and crosses indicate
the bias variations of the class A galaxies, circles those of class
B, and triangles those of class C. The growth of $b(z)$ for the
most luminous (class A) galaxies reported in the literature for low
and moderate redshifts (e.g. Marinoni et al. 2005; Papageorgiou et
al. 2012) continues to high redshifts. The data for galaxies of
classes B and C are limited to low redshifts, and apparently the
bias amplitude does not depend on the galaxy luminosity in that
range. It is difficult to assess whether the bias coincidence at
low redshift of the class A-C galaxies contradicts the
well-documented relationship between the bias and galaxy
luminosity. This is because a sharp rise of the bias amplitude with
galaxy luminosity

Figure 11. The redshift evolution of the linear bias factor. Dots
denote the bias factor calculated for the $r_0$ and $\gamma$ fits
of class A galaxies indicated with dots in Fig. 8. The error bars
correspond to the 68 per cent uncertainties drawn in Fig. 9.
Crosses show the bias of the class A galaxies assuming the slope of
the correlation function fixed at $\gamma = -1.92$ (crosses in
Fig. 8). The bias factors for classes B and C are shown with open
circles and triangles, respectively.


(e.g. Zehavi et al. 2011) is mostly confined to galaxies brighter
than $M^*$, while our class A corresponds roughly to $M^*$, and
classes B and C are fainter.
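
Equations (13)-(15) translate into a short calculation. The following is a minimal sketch, not the author's code: the present-epoch matter density (0.27) and the Hubble parameter (h = 0.70) are placeholder assumptions because they are not quoted in this section, matter fluctuations are taken to decrease with redshift as $\sigma_m(r \mid 0)\,D(z)$, and the $r_0$ and $\gamma$ values in the usage line are illustrative only.

```python
import numpy as np

OMEGA_M0 = 0.27              # assumed present-epoch matter density (not from the text)
OMEGA_L0 = 1.0 - OMEGA_M0    # flat model
SIGMA_M0 = 0.82              # sigma_m(0) quoted in the text
H = 0.70                     # assumed h = H0 / (100 km/s/Mpc)


def density_params(z):
    """Evolving Omega_m(z) and Omega_Lambda(z), equation (15)."""
    e2 = OMEGA_L0 + OMEGA_M0 * (1.0 + z) ** 3
    return OMEGA_M0 * (1.0 + z) ** 3 / e2, OMEGA_L0 / e2


def growth_g(z):
    """Growth-suppression factor g(z) of Carroll et al. (1992)."""
    om, ol = density_params(z)
    return 2.5 * om / (om ** (4.0 / 7.0) - ol + (1.0 + om / 2.0) * (1.0 + ol / 70.0))


def growth_d(z):
    """Normalized linear growth D(z) = g(z) / [g(0) (1 + z)], with D(0) = 1."""
    return growth_g(z) / (growth_g(0.0) * (1.0 + z))


def c_gamma(gamma):
    """Coefficient C_gamma of equation (14); gamma is the (negative) CF slope."""
    return 72.0 * 2.0 ** gamma / ((3.0 + gamma) * (4.0 + gamma) * (6.0 + gamma))


def sigma_g(r, r0, gamma):
    """Galaxy rms fluctuation in a sphere of radius r for xi(r) = (r / r0)**gamma."""
    return np.sqrt(c_gamma(gamma) * (r / r0) ** gamma)


def bias(z, r0, gamma, r=8.0 / H):
    """Linear bias b(z) = sigma_g / sigma_m, equation (13); r and r0 in Mpc."""
    return sigma_g(r, r0, gamma) / (SIGMA_M0 * growth_d(z))


# Illustrative values only: a fixed slope and correlation length at two redshifts.
b_low, b_high = bias(0.3, 6.0, -1.92), bias(2.5, 6.0, -1.92)
```

With $r_0$ and $\gamma$ held fixed, the bias in this sketch grows with redshift purely because $D(z)$ decreases, which is the qualitative behaviour described in the text.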

At low and moderate redshifts the present results are largely
consistent with measurements based on galaxy samples with
spectroscopic redshifts. The photometric redshifts provide only
statistical constraints on the galaxy distribution, and cannot be
used to localize individual objects precisely. This inevitably
broadens the uncertainties of our estimates. However, photometric
redshifts can be determined on a massive scale over a much larger
volume of the universe than is now feasible for spectroscopic
surveys. The present calculations are based purely on
observational data, i.e. we do not include mock galaxy catalogues
derived from cosmological simulations. This additionally increases
our final uncertainties. However, our method is more adequate as
long as the effects of cosmic variance on the galaxy clustering are
not fully accounted for.

Our measurements of the galaxy linear bias, which extend to high
redshifts, agree well in the low-redshift limit with previous
studies. To assess more precisely the evolutionary trends of the
galaxy CF at high redshifts, we plan to perform an analogous
investigation on other deep photometric redshift galaxy surveys.


APPENDIX A: ANALYTIC FITS OF THE
$z_p - z_s$ DEVIATIONS

Table A1. Parameters of the $p(y)$ distributions in five magnitude bins.

Mag bin   N_s^a     a_0      a_g      μ_g       s_g      a_r      μ_r      s_r
18-20      407      0.0028   0.3368   -0.0045   0.0042   0.6604   0.0028   0.0039
20-21     1050      0.0085   0.5423   -0.0006   0.0059   0.4492   0.0009   0.0049
21-22     2521      0.0120   0.1942    0.0037   0.0110   0.7938   0.0004   0.0045
22-23     3775      0.0239   0.3300   -0.0008   0.0044   0.6461   0.0039   0.0094
23-25       89      0.1773   0.1886   -0.0034   0.0026   0.6341   0.0042   0.0298

Simulations
23-25     Average   0.2090   0.2482   -0.0037   0.0036   0.5428   0.0096   0.0256
          rms       0.0968   0.1110    0.0015   0.0033   0.1710   0.0132   0.0138

a Number of galaxies with measured spectroscopic redshifts used in the fittings.

The distribution of the $z_p - z_s$ differences is represented
using the probability distribution $p(y)$, where
$y = (z_s - z_p)/(1 + z_p)$.

The complex nature of the $z_p - z_s$ deviations requires an
adequate functional form that accounts for a high fraction of
almost perfect $z_p$ measurements as well as occasional errors and
sporadic failures. Accordingly, the $p(y)$ distribution is defined
as the normalized sum of three components: the Gaussian function
$g(y)$, the resonance function $r(y)$ and a low-amplitude uniform
probability $a_0$:

$$ p(y) = a_g \cdot g(y) + a_r \cdot r(y) + a_0, \qquad (A1) $$

where

$$ g(y) = \frac{1}{\sqrt{2\pi}\, s_g} \exp\left[ -\frac{(y - \mu_g)^2}{2 s_g^2} \right], \qquad r(y) = \frac{1}{\pi}\, \frac{s_r}{(y - \mu_r)^2 + s_r^2}. \qquad (A2) $$

Parameters $\mu_g$, $s_g$, $\mu_r$ and $s_r$, plus the relative
contributions of the components, $a_g$, $a_r$ and $a_0$, are fitted
to the observed distributions of the $y$ parameter. The ML
estimation was


applied. Since the quality of the photometric redshifts
deteriorates with increasing magnitude, the data were divided into
five magnitude bands. The small number of measured spectroscopic
redshifts fainter than $m = 23$ and the high fraction of
catastrophic $z_p$ errors could potentially introduce relatively
large uncertainties into the present analysis. The question of the
extent to which uncertainties related to the fitting procedure
affect our estimates of the space correlation amplitude is
addressed as follows.

We generated 200 quasi-random samples of $(z_s, z_p)$ pairs using
the bootstrap method. The mock samples were drawn from the real
data. The average amplitudes of the parameters fitted to the
simulations and their rms scatter have been calculated. The results
of the entire procedure are listed in Table A1. The first five rows
give the best-fitting parameters for the real data, while the
bottom two rows show the results for the simulated 23-25 mag bin.

Because of the distinctive form of the fitted function, the
parameters are strongly correlated. Therefore the resultant $p(y)$
distributions exhibit only moderate variations, despite the high
scatter of the individual parameters. The stable character of the
$p(y)$ distribution generates, via equation (11), well-defined
spectroscopic redshift distributions, $n(z_s)$. We assessed the
uncertainties of $n(z_s)$ from the spread of the mock
distributions. A set of 200 simulated spectroscopic redshift
distributions, $n(z_s)$, was generated using the simulated $p(y)$
probability functions. The rms scatter of $n(z_s)$ is indicated in
Fig. 10 with thin curves for the three most distant bins.
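
The bootstrap step described above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the estimator passed in stands for the full six-parameter ML fit of $p(y)$, which is replaced in the usage example by two simple summary statistics of $y$.

```python
import numpy as np


def bootstrap_pairs(z_s, z_p, n_samples=200, seed=0):
    """Yield mock (z_s, z_p) samples drawn with replacement from the real pairs."""
    z_s, z_p = np.asarray(z_s), np.asarray(z_p)
    rng = np.random.default_rng(seed)
    n = z_s.size
    for _ in range(n_samples):
        idx = rng.integers(0, n, size=n)  # resample whole pairs, not single values
        yield z_s[idx], z_p[idx]


def bootstrap_scatter(z_s, z_p, estimator, n_samples=200, seed=0):
    """Mean and rms scatter of `estimator` over the bootstrap samples."""
    values = np.array([estimator(zs, zp)
                       for zs, zp in bootstrap_pairs(z_s, z_p, n_samples, seed)])
    return values.mean(axis=0), values.std(axis=0, ddof=1)


def summary_of_y(z_s, z_p):
    """Stand-in estimator: median and robust width of y = (z_s - z_p)/(1 + z_p)."""
    y = (z_s - z_p) / (1.0 + z_p)
    q25, q50, q75 = np.percentile(y, [25, 50, 75])
    return np.array([q50, q75 - q25])
```

Calling bootstrap_scatter(z_s, z_p, summary_of_y) returns two arrays analogous to the ‘Average’ and ‘rms’ rows of Table A1; substituting an ML fitting routine for summary_of_y would give the scatter of the fitted parameters themselves.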


APPENDIX B: TESTING THE ESTIMATORS 
OF CF PARAMETERS 

The full procedure of measuring the 3D CF parameters us¬ 
ing the catalogue of photometric redshifts has been tested on 
the simulated data. First, we generate a 3D point distribu¬ 
tion according to a priori defined statistical characteristics. 
Then, the data are processed in the same way as the COS¬ 
MOS observations to retrieve the CF parameters. Finally, 
the results are confronted with the original values. 


B1 Modelling the space distribution with definite 
CF 

The objective was to construct the model distribution of 
points characterized by the power-law CF with the slope 


and normalization close to those determined for the COS¬ 
MOS data. This synthetic material was not devised to im¬ 
itate other statistical properties of the real galaxy popula¬ 
tion. From among a diversity of space distributions satisfy¬ 
ing the selected CF, we took a computationally manageable 
position-dependent function. The particular point distribu¬ 
tion was realized by the MC method. The points were drawn 
according to the properly defined probability distribution. 
The algorithm that provided an acceptable approximation 
for the power-law CF was constructed as follows. 

Two populations of points are distributed within a specified
volume: (a) points concentrated in ‘clusters’³ and (b) ‘field’
points distributed randomly, but outside clusters. The cluster
centres are distributed fully randomly. Points within each cluster
are distributed according to the broken power law:

$$ \rho(r) = \begin{cases}
\rho_b \left( \dfrac{r_1}{r_2} \right)^{\alpha_2} \left( \dfrac{r}{r_1} \right)^{\alpha_1} & \text{for } r < r_1, \\
\rho_b \left( \dfrac{r}{r_2} \right)^{\alpha_2} & \text{for } r_1 \le r < r_2, \\
\rho_b & \text{for } r \ge r_2,
\end{cases} \qquad (B1) $$

where $\rho_b$ is the space density of points outside clusters, and
$r_1$ and $r_2$ are characteristic cluster radii that delineate two
zones with distinct power indices $\alpha_1$ and $\alpha_2$. The
amplitude of the CF depends also on the number of clusters or,
equivalently, on the fraction of volume occupied by clusters. In
the subsequent calculations, we adopted the following parameter
values: radii $r_1$ and $r_2$ of 6 and 16 Mpc, respectively, and
power-law indices $\alpha_1 = -2.44$ and $\alpha_2 = -1.25$ for the
central and outer cluster zones. The volume occupied by clusters
was set at $\approx 0.40$ of the total survey volume. This set of
parameters fixes the relative contributions of field and clustered
points to the total number at 0.33 and 0.67, respectively.
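
Drawing points inside a single cluster can be sketched as below. This is an illustration under stated assumptions rather than the author's algorithm: the profile is taken to be continuous at $r_1$ and $r_2$, radii are drawn by numerically inverting the cumulative shell mass $\propto r^2 \rho(r)$, directions are isotropic, and the function names are hypothetical.

```python
import numpy as np

R1, R2 = 6.0, 16.0               # cluster radii in Mpc (from the text)
ALPHA1, ALPHA2 = -2.44, -1.25    # inner and outer power-law indices (from the text)


def density_ratio(r):
    """rho(r) / rho_b for a broken power law assumed continuous at R1 and R2."""
    r = np.asarray(r, dtype=float)
    inner = (R1 / R2) ** ALPHA2 * (r / R1) ** ALPHA1
    outer = (r / R2) ** ALPHA2
    return np.where(r < R1, inner, np.where(r < R2, outer, 1.0))


def sample_cluster_radii(n, rng, n_grid=4096):
    """Draw n radii in [0, R2] with probability proportional to r^2 rho(r)."""
    edges = np.linspace(0.0, R2, n_grid + 1)
    mid = 0.5 * (edges[:-1] + edges[1:])       # midpoints avoid the r = 0 singularity
    shell_mass = mid ** 2 * density_ratio(mid)
    cdf = np.concatenate(([0.0], np.cumsum(shell_mass)))
    cdf /= cdf[-1]
    return np.interp(rng.random(n), cdf, edges)  # inverse-CDF sampling


def sample_cluster_points(centre, n, rng):
    """Place n points around `centre` with isotropic directions."""
    radii = sample_cluster_radii(n, rng)
    directions = rng.standard_normal((n, 3))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    return np.asarray(centre, dtype=float) + radii[:, None] * directions


rng = np.random.default_rng(7)
points = sample_cluster_points([100.0, 50.0, 800.0], 1000, rng)
```

Because $\alpha_1 > -3$, the shell mass $r^2 \rho(r)$ remains integrable at the centre, so the steep inner profile does not prevent sampling.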

The present prescription for the space point distribution yields a
CF emulating the power law over a wide range of separations
extending up to ~15 Mpc. This is illustrated in Fig. B1, where
~62000 points were distributed in a volume of $1.73 \times 10^7$
Mpc$^3$ according to equation (B1) with
$\rho_b = 2.25 \times 10^{-3}$ Mpc$^{-3}$. Fitting the power law to


3 Here, ‘clusters’ represent clumps of points that generate the
required CF, with no relation to real galaxy clusters.

Figure B1. Dots: the correlation function of 62158 points
distributed according to the model described in the text; line:
the power-law fit $\xi(r) = (r/r_m)^{\gamma_m}$ with
$r_m = 5.89$ Mpc and $\gamma_m = -1.89$. Abscissa: $r$ [Mpc].

the CF generated by this set of parameters gives the slope
$\gamma_m = -1.9$ and the correlation length $r_m = 5.9$ Mpc.

It is important to emphasize that this model distribution
generates a definite CF but has no resemblance to the actual galaxy
space distribution. The objective of the procedure is just to test
the efficiency of the method, i.e. whether the algorithm presented
in Sections 3 and 4 is able to extract the parameters of the CF
from the surface data. One can expect that individual realizations
of the synthetic model limited to the volume of the COSMOS field
would provide CF parameters scattered around the original values
assumed in the simulations. The size of this spread is
characteristic of the model and does not represent the
uncertainties of the CF parameters derived from the real data.

B2 Extracting the PL parameters from the model 
data distributed in the COSMOS field 

We now adjust our model to the global constraints that 
shape the COSMOS data. The objective is to reproduce 
some basic statistical characteristics (such as the total num¬ 
ber of objects and a form of the radial distribution) of the 
most luminous galaxies, denoted as class A. 

First, the points are distributed in a pyramid-shaped volume with
a solid angle of $1.42 \times 1.38$ deg$^2$, which corresponds to
the size of the COSMOS/Subaru field. To emulate the decreasing
space density of galaxies at large distances, the probability of
qualifying a point for further processing was reduced accordingly.
As a result, the final set of ‘objects’ contained ~76000 points. We
claim that the present method is essentially insensitive to the
numerous empty regions distributed over the field as well as to the
smooth, large angular scale inhomogeneities of the survey. Although
variations of the limiting magnitudes have been investigated
(Taniguchi et al.), and the final catalogue is free from
substantial inhomogeneities, one should allow for some residual
instrumental fluctuations that mimic the clustering signal. To
check this, we ran the tests with a superimposed mask that imitates
holes in the actual catalogue due to bright stars. The ‘Swiss
cheese’ mask contains more than 125 circles representing the most
prominent blank fields. Additionally, we test the effects of
potential angular inhomogeneities of the survey. This is achieved
by applying the ‘fluctuation filter’, $F_a$. We superimpose a
filter that introduces fluctuations of the point surface density
with a characteristic wavelength $\lambda = 40$ arcmin in both
directions (RA and Dec.):

$$ n(x, y) = \bar{n} \left[ 1 + a \sin(kx) \sin(ky) \right], \qquad (B2) $$

where $n(x, y)$ and $\bar{n}$ are the local and the mean point
densities, respectively, $a$ is the fluctuation amplitude, and
$k = 2\pi/40$ arcmin$^{-1}$. Three amplitudes $a$ were applied:
0.025, 0.05 and 0.10.
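
One way to impose the modulation of equation (B2) on a point catalogue is random thinning. The sketch below is an assumption about the implementation, which the text does not specify: each point is kept with a probability proportional to $1 + a \sin(kx) \sin(ky)$, so the mean density drops by a factor $1/(1 + a)$ while the relative fluctuations follow (B2). Coordinates are in arcmin and the function name is hypothetical.

```python
import numpy as np


def apply_fluctuation_filter(x, y, a, rng, wavelength=40.0):
    """Thin points so that their surface density follows 1 + a sin(kx) sin(ky).

    x, y are angular coordinates in arcmin; 0 <= a <= 1 is the amplitude.
    """
    k = 2.0 * np.pi / wavelength
    keep_prob = (1.0 + a * np.sin(k * x) * np.sin(k * y)) / (1.0 + a)
    keep = rng.random(x.size) < keep_prob
    return x[keep], y[keep]


# Illustrative use on a uniform catalogue covering two wavelengths per axis.
rng = np.random.default_rng(42)
x = rng.uniform(0.0, 80.0, 200_000)
y = rng.uniform(0.0, 80.0, 200_000)
x_f, y_f = apply_fluctuation_filter(x, y, 0.10, rng)
```

Thinning keeps the positions of the surviving points unchanged, so any clustering signal already present in the catalogue is preserved apart from the imposed large-scale modulation.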

‘Spectroscopic redshifts’ were assigned to all the objects assuming
the cosmological model adopted in the paper. To attach a
‘photometric redshift’ to each object, we apply a method identical
to that described in Appendix A. Using the ML method, we find the
best-fitting parameters of the $p(x)$ probability distribution,
where $x = (z_p - z_s)/(1 + z_s)$. Since the overall shapes of
$p(x)$ and $p(y)$ are similar, the functional form of both
distributions is the same. The COSMOS $p(x)$ distributions depend
on the apparent magnitude of objects. To mimic this effect, each
object in our catalogue was labelled with an ‘apparent magnitude’
drawn from the magnitude distribution of the real data.

The final simulated catalogue contained the list of points
identified just by their angular coordinates and photometric
redshifts. Such ‘observational material’ was analysed using the
method and formulae in Sections 3 and 4. Calculations were
performed for 25 sets of simulated catalogues, using various
combinations of the bright star mask (BS) and fluctuation filters
($F_{0.025}$, $F_{0.05}$ and $F_{0.10}$). The quantitative
comparison of the CF parameter estimates, $\gamma$ and $r_0$, with
their model amplitudes of $-1.89$ and 5.89 Mpc is summarized in
Table B1. In columns 2-11, the average amplitudes of $\gamma$ and
$r_0$ for the 25 data sets are shown. In the last two columns, the
rms scatter averaged over all the mask + filter settings is given.

We note that the mask and filters do not, in fact, introduce any
systematic corrections to our estimates of the CF parameters. One
could expect a moderate rise of the uncertainties because both
modifications add noise to the data. However, the effect is small
and the total scatter is dominated by the stochastic nature of the
simulated catalogues. In Fig. B2 the distributions of 25 runs with
the BS mask and the fluctuation filter $F_{0.05}$ are shown in the
$r_0$-$\gamma$ plane. It appears that the distributions of both
parameters are roughly centred on their original amplitudes,
although some bias towards flatter slopes and larger correlation
lengths is visible in the distance bins that contain a lower number
of objects.

One should notice that the extent of the scatter plots in Fig. B2
results from the statistical character of the investigated matter.
That also includes the particular distribution of clusters in the
simulated catalogues. Therefore, the parameter dispersion derived
for the synthetic catalogues is a sum of two components. First, it
is generated by the noise associated with the drawing of points
according to the specific underlying probability distribution.
Second, it results from the different realizations of the
probability distribution itself. Thus, the scatter plots in Fig. B2
do not represent the uncertainties of parameters estimated from the
real data.

Table B1. Parameter estimation applied to the simulated, synthetic catalogues of photometric redshifts.

Distance   No mask        BS             BS + F0.025    BS + F0.05     BS + F0.10     rms
bin        γ       r_0    γ       r_0    γ       r_0    γ       r_0    γ       r_0    γ      r_0
1          -1.87   6.6    -1.87   6.5    -1.89   6.3    -1.84   6.8    -1.88   6.4    0.22   2.2
2          -1.87   6.1    -1.86   6.3    -1.87   6.1    -1.87   6.2    -1.87   6.2    0.10   1.1
3          -1.88   6.1    -1.87   6.1    -1.88   6.1    -1.88   6.1    -1.87   6.1    0.11   0.9
4          -1.86   6.3    -1.86   6.3    -1.87   6.3    -1.88   6.2    -1.86   6.3    0.09   0.9
5          -1.87   6.0    -1.90   5.8    -1.88   5.9    -1.89   5.9    -1.90   5.9    0.15   1.1
6          -1.88   6.2    -1.91   6.1    -1.90   6.1    -1.90   6.1    -1.89   6.2    0.21   1.6

Notes. The simulated data points were drawn according to the underlying probability distribution that generates the power-law CF
with $\gamma = -1.89$ and $r_0 = 5.9$ Mpc. BS: bright star mask; $F_{0.025}$, $F_{0.05}$ and $F_{0.10}$: filters; see text for details.

Figure B2. Estimates of the power-law parameters of the
correlation functions for 25 sets of simulated data with the
‘bright star’ mask and ‘fluctuation filter’ $F_{0.05}$. The input
CF parameters are indicated by large full dots; parameters
extracted using the present method are shown with small dots. See
text for details.

A good agreement between the input parameters and 
their estimates indicates that the procedures introduced in 
the paper can be used to derive the space characteristics 
of the CF using photometric redshifts. Possible systematic 
deviations, if any, are substantially smaller than large sta¬ 
tistical uncertainties. 






